cmd-diff: a few misc enhancements #4253
Code Review
This pull request introduces several enhancements to cmd-diff, including new --metal-ls and --metal-du differs and a --difftool option to use an external diff tool like vimdiff. The refactoring to support the new differs is well-executed, with shared logic extracted into a reusable generator.
My review focuses on a couple of areas for improvement. The implementation of --difftool relies on a global variable, which can affect maintainability. I've suggested alternatives to make this dependency explicit. I also found a minor issue with a superfluous attribute access in the diff_cmd_outputs function. Overall, these are good additions to the tool.
| I tried to build cosa with the patch on aarch64 and failed, not sure if I missed something.  | 
| 
 Potentially running out of memory? Or maybe the machine you are on doesn't support virtualization (i.e. does  | 
| 
 Make sure to also:  | 
The .stdout at the end of this line is unnecessary and has no effect. When stdout is redirected to a file object, the stdout attribute of the returned CompletedProcess object is None. Assisted-By <gemini-code-assist> coreos#4253 (comment)
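The point about `.stdout` can be checked with a minimal sketch. The helper name `run_to_file` and the child command are illustrative assumptions, not code from this PR; only `subprocess.run` and its documented behavior are relied on.

```python
import subprocess
import sys
import tempfile


def run_to_file(cmd):
    """Run cmd with stdout redirected to a temporary file.

    Returns the CompletedProcess and the text that landed in the file.
    (Hypothetical helper, for illustration only.)
    """
    with tempfile.TemporaryFile(mode='w+') as out:
        result = subprocess.run(cmd, stdout=out, check=True)
        out.seek(0)
        return result, out.read()


result, text = run_to_file([sys.executable, '-c', 'print("hello")'])
# Nothing was captured, so the attribute is None; the output is in the file.
print(result.stdout)
print(text.strip())
```

Because the output goes to the file object, chaining `.stdout` onto the `subprocess.run(...)` call only evaluates to `None`.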
Nice, I like new differs.
        
          
src/cmd-diff (Outdated)
      | c = list(cmd) | ||
| if '{}' not in c: | ||
| c += ['{}'] | ||
| idx = c.index('{}') | 
This doesn't need to be in the loop. Probably cleaner to still clone the list for each iteration though.
Right. It doesn't need to be in the loop, but keeping the code grouped together with the strategy was nicer than having multiple if strategy= branches.
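For reference, the hoisting suggested above could look roughly like the sketch below. The function name `build_commands` and the final substitution step are assumptions; the snippet under review only shows the code up to the `idx = c.index('{}')` line.

```python
def build_commands(cmd, paths):
    """Return one argv per path, with the '{}' placeholder replaced.

    Hypothetical helper: the placeholder handling runs once, outside the
    loop, and the list is still cloned for each iteration.
    """
    template = list(cmd)  # clone so the caller's list is never mutated
    if '{}' not in template:
        template.append('{}')  # default: pass the path as the last argument
    idx = template.index('{}')

    commands = []
    for path in paths:
        c = list(template)  # fresh copy per path
        c[idx] = path
        commands.append(c)
    return commands


print(build_commands(['ls', '-l'], ['/mnt/from', '/mnt/to']))
```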
        
          
src/cmd-diff (Outdated)
      |  | ||
| def diff_cmd_outputs(cmd, path_from, path_to): | ||
| def diff_cmd_outputs(cmd, path_from, path_to, strategy='template'): | ||
| workingdir = os.getcwd() | 
Minor/optional: in situations like this, I prefer to use the signature default so that it's equivalent to not passing an argument at all in the default case. Often (like here), that's None.
i.e.
| workingdir = os.getcwd() | |
| workingdir = None | 
? I can make that change if you like.
Yeah, exactly.
Will be resolved in the next upload
Should be resolved now
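A small sketch of why `None` works as the default here: `cwd=None` is `subprocess.run`'s own default, so passing it through is the same as not passing `cwd` at all and the child inherits the current directory. The helper name `run_pwd` is a hypothetical stand-in, not the function from this PR.

```python
import os
import subprocess
import sys
import tempfile

# Child process that reports its own working directory.
PWD = [sys.executable, '-c', 'import os; print(os.getcwd())']


def run_pwd(workingdir=None):
    """Run the child in workingdir; None means 'inherit the parent's cwd'."""
    proc = subprocess.run(PWD, cwd=workingdir, check=True,
                          capture_output=True, text=True)
    return proc.stdout.strip()


inherited = run_pwd()  # same as not passing cwd at all
with tempfile.TemporaryDirectory() as tmp:
    explicit = run_pwd(tmp)
    explicit_matches = os.path.realpath(explicit) == os.path.realpath(tmp)

print(inherited, explicit_matches)
```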
        
          
src/cmd-diff (Outdated)
      |  | ||
|  | ||
| def diff_cmd_outputs(cmd, path_from, path_to): | ||
| def diff_cmd_outputs(cmd, path_from, path_to, strategy='template'): | 
Feels a bit weird to use a string for this. Couldn't it just be a boolean?
That avoids the assert also.
I kind of liked leaving this open to different strategies in the future and also being more explicit in the naming.
i.e. what would the boolean argument's name be? use_cd_not_template=False ?
Or maybe just chdir?
Or we could make it into a proper enum too like OSTreeImport if we want to keep this structure.
Will make it into an enum.
Should be resolved now
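As a rough illustration of the enum approach discussed in this thread, here is a sketch. The names `DiffStrategy` and `plan_command` are hypothetical and are not taken from the final patch; the idea is only that an enum keeps the explicit naming while making a bad value fail early.

```python
from enum import Enum


class DiffStrategy(Enum):
    """Hypothetical enum; the member names mirror the two strategies discussed."""
    TEMPLATE = 1  # substitute the path for '{}' in the command
    CD = 2        # run the command unchanged from inside the path


def plan_command(cmd, path, strategy=DiffStrategy.TEMPLATE):
    """Return (argv, cwd) describing how cmd should be run against path."""
    if not isinstance(strategy, DiffStrategy):
        raise TypeError(f'expected DiffStrategy, got {strategy!r}')
    if strategy is DiffStrategy.CD:
        return list(cmd), path
    return [path if arg == '{}' else arg for arg in cmd], None


print(plan_command(['find', '.'], '/mnt/from', DiffStrategy.CD))
print(plan_command(['ls', '-l', '{}'], '/mnt/from'))
```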
| kustomize | ||
|  | ||
| # For vimdiff | ||
| vim-enhanced | 
Hmm, how can we structure this so that this functionality is not dependent on the $preferred_tool shipping in cosa? One option is that cosa diff --difftool in that case just outputs the command and you run it.
Or... it could also do something like systemd-run --user -t --same-dir --wait --collect git difftool ... maybe.
Yeah. I like git-delta so being able to plug in my own tool would be nice. How about we just write the git diff output to a file and then we can invoke the tool we want?
> cosa diff --difftool in that case just outputs the command and you run it.
Only thing about this is it implies not cleaning up after itself.
> Or... it could also do something like systemd-run --user -t --same-dir --wait --collect git difftool ... maybe.
This implies functionality we don't necessarily have today. i.e. in the run via bash function case where we run inside the COSA container, we don't have things mounted into the container that would allow it to run a systemd unit on the host.
It definitely sucks installing this into the COSA container too, but it was the easiest thing to do. We don't have to do that, I'd just continue to do what I've been doing, which is cosa shell and then sudo dnf install vim before running cosa diff --difftool
> Only thing about this is it implies not cleaning up after itself.
Indeed. I think it's OK in that case to tell the user they're responsible for cleanup.
> This implies functionality we don't necessarily have today. i.e. in the run via bash function case where we run inside the COSA container, we don't have things mounted into the container that would allow it to run a systemd unit on the host.
Right yeah, the idea would be to add another option to the alias.
> It definitely sucks installing this into the COSA container too, but it was the easiest thing to do. We don't have to do that, I'd just continue to do what I've been doing, which is cosa shell and then sudo dnf install vim before running cosa diff --difftool
Not strongly against to be clear, but I think it'd be a larger win to try to make this work for everyone.
If this is a blocker I can remove the part in src/deps.txt that installs vim.
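The "just output the command and let the user run it" option from this thread could be sketched as below. This is not what the PR implements. The helper name is hypothetical, and it assumes `git difftool` accepts `--no-index` the same way `git diff` does; `--no-prompt` and `--tool=` are standard `git difftool` options.

```python
import shlex


def difftool_command(path_from, path_to, tool=None):
    """Build (but do not run) a git difftool invocation for two paths.

    Hypothetical helper; printing the command leaves the choice of tool,
    and any cleanup of the compared paths, to the user.
    """
    cmd = ['git', 'difftool', '--no-index', '--no-prompt']
    if tool is not None:
        cmd.append(f'--tool={tool}')
    cmd += [path_from, path_to]
    return cmd


# Print a copy-pasteable command line instead of executing it.
print(shlex.join(difftool_command('/tmp/from', '/tmp/to', tool='vimdiff')))
```

With `tool=None` the command falls back to whatever `diff.tool` the user has configured, which would also cover tools that are not shipped in the container.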
| # Now that the mounts are live, we can diff them | ||
| git_diff(mount_dir_from, mount_dir_to) | ||
| # Allow the caller to operate on these values | ||
| yield mount_dir_from, mount_dir_to | 
Wouldn't a more appropriate model for the single yield case be to just take a function as argument?
I guess. Like a callback?
Looked into this a little, but time boxed myself on it and ran out of time. Can we leave it and improve later?
Basically I mean something like
def diff_metal_ls(diff_from, diff_to):
	cmd = ['find', '.']
	def differ(mount_dir_from, mount_dir_to):
		diff_cmd_outputs(cmd, mount_dir_from, mount_dir_to, strategy='cd')
	diff_metal_helper(diff_from, diff_to, differ)
or did that not work?
I honestly can't remember what I tried at this point. It looks like this will change soon anyway with #4333 so I'd prefer not to rework it now if I don't have to.
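To make the two shapes in this thread concrete, here is a self-contained toy comparison. Temporary directories stand in for the mounted filesystems, and both helper names are hypothetical; the real helper in this PR mounts disk images instead.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def mounted_pair():
    """Single-yield generator style: the caller works inside a with-block."""
    with tempfile.TemporaryDirectory() as d_from, \
            tempfile.TemporaryDirectory() as d_to:
        yield d_from, d_to  # cleanup runs when the with-block exits


def with_mounted_pair(differ):
    """Callback style: the helper owns setup and teardown and calls differ."""
    with mounted_pair() as (d_from, d_to):
        return differ(d_from, d_to)


seen = []


def differ(d_from, d_to):
    seen.extend([d_from, d_to])
    return os.path.isdir(d_from) and os.path.isdir(d_to)


ok = with_mounted_pair(differ)
print(ok, [os.path.exists(p) for p in seen])
```

Both styles guarantee cleanup; the callback version simply hides the `with` statement from each differ.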
| Sorry for the late reply, tested on x86_64, and it works.  | 
| 
 The answer to this is nuanced. We could potentially add a pipe here to sort the output, but with  When I was doing this I'd do just that. I'd pipe the left and right panes through  I'd also do things like remove the unique hashes from the output so that the diffs were normalized (i.e. removing unique things that aren't important to the diff). So I view this as something that can be solved by using a powerful difftool rather than having to bake all that functionality directly into  | 
f034c31 to 537465a
    | rebased this (but I didn't do anything other than that). Given my inability to get back to this due to other high priority things would it be preferable for me to move this to draft OR to get it in and worry about the suggestions for improvements later? | 
b24601f to da70490
    | ok. this is now updated with a much more lightweight  | 
4caa5e8 to c712d61
    It had some common elements so let's use a loop.
Since we could be operating on directories or files change file_from -> path_from and file_to -> path_to. Also change the temporary output filenames to more properly indicate they are outputs.
c712d61 to a6b67c8
    | rebased on top of latest main and now the golang-ci-lint should pass! | 
This strategy simply changes directory into the given path before
running the provided command rather than replacing a templated `{}`
with the path.
Useful for commands that behave more cleanly when run from inside
the directory they should operate on.
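One plausible reason the change-directory strategy gives cleaner diffs, sketched with a portable stand-in for `find`: with the template strategy each output line carries the mount path as a prefix, so two otherwise identical trees still differ line by line. The helper names and the stand-in script are assumptions for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Portable stand-in for `find <dir>`: print every file path under argv[1].
LISTER = [sys.executable, '-c', (
    'import os, sys\n'
    'for root, dirs, files in os.walk(sys.argv[1]):\n'
    '    dirs.sort()\n'
    '    for name in sorted(files):\n'
    '        print(os.path.join(root, name))\n'
)]


def list_with_template(path):
    """Template strategy: the path is passed as an argument."""
    return subprocess.run(LISTER + [path], check=True,
                          capture_output=True, text=True).stdout


def list_with_cd(path):
    """cd strategy: run against '.' from inside the path."""
    return subprocess.run(LISTER + ['.'], cwd=path, check=True,
                          capture_output=True, text=True).stdout


with tempfile.TemporaryDirectory() as d1, tempfile.TemporaryDirectory() as d2:
    for d in (d1, d2):
        with open(os.path.join(d, 'a.txt'), 'w') as f:
            f.write('same\n')
    template_equal = list_with_template(d1) == list_with_template(d2)
    cd_equal = list_with_cd(d1) == list_with_cd(d2)

# Identical trees: only the cd strategy produces identical listings.
print(template_equal, cd_equal)
```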
a6b67c8 to 671320f
    Having vimdiff is a lot more powerful to me because you can manually do a few things in the terminal to massage the files on each side of the diff to give you more information (i.e. narrow in on exactly what diff you are interested in). Let's add a --difftool boolean flag to trigger `git difftool`, which will allow diffs to be displayed using vimdiff. Additionally include vim-enhanced so we have `vimdiff` installed and at our disposal.
This way we can have more commands that can leverage this code for different "diffs" on the resulting mounted filesystems. Prep for a future commit.
A few more views of differences between two metal disk image mounted filesystems.
This way if it's not specified some sane value is used.
This basically maps to passing `--format=json` to `rpm-ostree db diff`. Useful for updating meta.json with info about package differences and advisories.
Similar to get_build_meta(), let's make a convenience function for grabbing commitmeta.
This set of function names leverages rpm-ostree for the diffing.
This adds two new arguments for generating rpm-ostree db diff compatible output by just leveraging the information in the commitmeta.json files in each build's directory. This is a much more lightweight way to get diffs rather than having to download and import the entire ociarchive, which we are now having to do in the build-with-buildah world where the rpm information isn't available to us in the OSTree commit object like it was in the past.

This was assisted by Google/Gemini and got me 75% of the way there. The prompt provided:

```
Look at the files old-commitmeta.json and new-commitmeta.json and generate a two functions in src/cmd-diff that will produce similar diff output. The first function will produce human readable diff output and should be in the form as shown in the human-diff.txt file. The second function should produce JSON output and be in the format as shown in the json-diff.txt file. The commitmeta.json files will exist in the build directory for a build (so when diffing two builds you'll need to look at the commitmeta.json for the first build and then the commitmeta.json for the second build and then produce the diff output based on the differences in those two files.
```

Assisted-By: <google/gemini-2.5-pro>
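A heavily simplified sketch of the idea in this commit message follows. It assumes each commitmeta.json carries a package list under the key `rpmostree.rpmdb.pkglist`, with entries shaped like `[name, epoch, version, release, arch]`; both the key and the shape are assumptions here, and the function name is hypothetical. It does not reproduce the `rpm-ostree db diff` output formats.

```python
PKGLIST_KEY = 'rpmostree.rpmdb.pkglist'  # assumed key name


def diff_pkglists(old_meta, new_meta):
    """Classify packages as added, removed, or changed between two builds.

    Hypothetical helper. Packages are keyed by name only, so multiple
    installed versions or arches of one package would collapse, and
    'changed' does not distinguish upgrades from downgrades (that would
    need RPM version comparison).
    """
    old = {p[0]: tuple(p) for p in old_meta.get(PKGLIST_KEY, [])}
    new = {p[0]: tuple(p) for p in new_meta.get(PKGLIST_KEY, [])}
    return {
        'added': sorted(set(new) - set(old)),
        'removed': sorted(set(old) - set(new)),
        'changed': sorted(n for n in set(old) & set(new) if old[n] != new[n]),
    }


# Made-up sample data, for illustration only.
old_meta = {PKGLIST_KEY: [['bash', '0', '5.2.26', '1.fc40', 'x86_64'],
                          ['nano', '0', '7.2', '6.fc40', 'x86_64']]}
new_meta = {PKGLIST_KEY: [['bash', '0', '5.2.32', '1.fc40', 'x86_64'],
                          ['vim-minimal', '2', '9.1.0', '1.fc40', 'x86_64']]}
print(diff_pkglists(old_meta, new_meta))
```

In practice the two dictionaries would come from `json.load` on the commitmeta.json file in each build's directory.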
Since this version is more lightweight (doesn't require OSTree import) let's make it the primary one that gets called when someone does `cosa diff --rpms`.
671320f to 5ba9aac
One comment, but otherwise LGTM! Nice work on the RPM diff.
| args.diff_to = latest_build | ||
| elif args.diff_from is None: | ||
| args.diff_from = latest_build | ||
| args.diff_from = builds.get_previous() | 
Hmm, what were you trying to do that surprised you and motivated this?
E.g. if I do cosa diff --to=$build I expect the from to be the latest build. This matches git diff semantics. Having it actually diff against the previous build is confusing.
See the last commit in #4327
Basically I wanted to be able to not have to specify the previous build in cmd-build-with-buildah and have cosa diff detect that for me.
It also enables things like being able to run cosa diff --rpms without having to specify a --to and a --from at all, like rpm-ostree db diff does if you don't provide commits to it.
Maybe the behavior should be "if to is latest_build then from will default to previous_build, otherwise default to latest_build".
> It also enables things like being able to run cosa diff --rpms without having to specify a --to and a --from at all, like rpm-ostree db diff does if you don't provide commits to it.
Hmm, that's already the case today, no?
ISTM like in the last commit of #4327, we can just not pass in --to or --from at all in the default case. If there's a parent build, just passing --from should do the right thing.
> Hmm, that's already the case today, no?
You are right. My mistake. I didn't see the opportunity to not pass in the --to there.
Reverted in #4327
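The defaulting behavior that the thread settles on (the pre-existing one) can be summarized as a sketch. The function name is hypothetical, and the both-unset branch is inferred from the discussion rather than copied from src/cmd-diff.

```python
def resolve_diff_range(diff_from, diff_to, latest_build, previous_build):
    """Fill in whichever side of the diff was not given on the command line.

    Hypothetical helper mirroring git diff style semantics: a missing side
    defaults to the latest build, and with neither side given the diff is
    previous build -> latest build.
    """
    if diff_from is None and diff_to is None:
        return previous_build, latest_build
    if diff_from is None:
        return latest_build, diff_to   # cosa diff --to=X
    if diff_to is None:
        return diff_from, latest_build  # cosa diff --from=X
    return diff_from, diff_to


# Build IDs below are made up for illustration.
print(resolve_diff_range(None, None, 'build-3', 'build-2'))
print(resolve_diff_range('build-1', None, 'build-3', 'build-2'))
```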
Turns out this behavior wasn't really required and there was some preference to leave it the other way [1]. This reverts commit 9ffb64f. [1] coreos#4253 (comment)
See individual commit messages.
We now have `--metal-ls` and `--metal-du`, and also a `--difftool` to tell `cosa diff` to output using `git difftool` versus `git diff` so we can use `vimdiff`.